Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 1599 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 212 |
| Duplicate rows (%) | 13.3% |
| Total size in memory | 150.0 KiB |
| Average record size in memory | 96.1 B |
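The derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming the percentage is duplicate rows over total observations and the average record size is total memory over observations (the variable names are illustrative, not taken from the profiler):

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 1599
duplicate_rows = 212
total_kib = 150.0

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * duplicate_rows / n_rows

# Average bytes per record (1 KiB = 1024 bytes).
avg_record_bytes = total_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")       # 13.3%
print(f"{avg_record_bytes:.1f} B")   # 96.1 B
```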
Variable types
| Numeric | 12 |
|---|---|
Alerts
| Dataset has 212 (13.3%) duplicate rows | Duplicates |
|---|---|
fixed acidity is highly correlated with citric acid and 2 other fields | High correlation |
volatile acidity is highly correlated with citric acid | High correlation |
citric acid is highly correlated with fixed acidity and 2 other fields | High correlation |
free sulfur dioxide is highly correlated with total sulfur dioxide | High correlation |
total sulfur dioxide is highly correlated with free sulfur dioxide | High correlation |
density is highly correlated with fixed acidity | High correlation |
pH is highly correlated with fixed acidity and 1 other field | High correlation |
fixed acidity is highly correlated with pH | High correlation |
free sulfur dioxide is highly correlated with total sulfur dioxide | High correlation |
total sulfur dioxide is highly correlated with free sulfur dioxide | High correlation |
pH is highly correlated with fixed acidity | High correlation |
fixed acidity is highly correlated with citric acid and 3 other fields | High correlation |
citric acid is highly correlated with fixed acidity and 3 other fields | High correlation |
residual sugar is highly correlated with density | High correlation |
chlorides is highly correlated with citric acid and 1 other field | High correlation |
free sulfur dioxide is highly correlated with total sulfur dioxide | High correlation |
total sulfur dioxide is highly correlated with free sulfur dioxide | High correlation |
density is highly correlated with fixed acidity and 3 other fields | High correlation |
pH is highly correlated with fixed acidity and 2 other fields | High correlation |
sulphates is highly correlated with citric acid and 1 other field | High correlation |
alcohol is highly correlated with fixed acidity and 1 other field | High correlation |
Reproduction
| Analysis started | 2023-04-09 20:20:27.439721 |
|---|---|
| Analysis finished | 2023-04-09 20:21:12.677995 |
| Duration | 45.24 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
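The reported duration is the difference between the two timestamps above. A small check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2023-04-09 20:20:27.439721")
finished = datetime.fromisoformat("2023-04-09 20:21:12.677995")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

print(f"{duration:.2f} seconds")  # 45.24 seconds
```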
fixed acidity
Real number (ℝ)
| Distinct | 96 |
|---|---|
| Distinct (%) | 6.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.599372207 × 10⁻¹⁶ |
| Minimum | -2.137044857 |
| Maximum | 4.355149169 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 969 |
| Negative (%) | 60.6% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -2.137044857 |
|---|---|
| 5-th percentile | -1.27524919 |
| Q1 | -0.7007187452 |
| median | -0.2410943893 |
| Q3 | 0.505795189 |
| 95-th percentile | 1.999574346 |
| Maximum | 4.355149169 |
| Range | 6.492194027 |
| Interquartile range (IQR) | 1.206513934 |
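The last two rows of the quantile table are derived from the others: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A quick check against the values reported above (agreement is up to rounding of the printed figures):

```python
import math

# Values copied from the quantile table above.
minimum = -2.137044857
q1 = -0.7007187452
q3 = 0.505795189
maximum = 4.355149169

value_range = maximum - minimum  # spread of the whole sample
iqr = q3 - q1                    # spread of the middle 50%

# Compare with the reported Range and IQR, allowing for rounding.
print(math.isclose(value_range, 6.492194027, abs_tol=1e-8))
print(math.isclose(iqr, 1.206513934, abs_tol=1e-8))
```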
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | 2.779131428 × 10¹⁵ |
| Kurtosis | 1.132143398 |
| Mean | 3.599372207 × 10⁻¹⁶ |
| Median Absolute Deviation (MAD) | 0.5745304448 |
| Skewness | 0.9827514413 |
| Sum | 5.684341886 × 10⁻¹³ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
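Every variable in this report has a variance of 1.000625782 and a mean within floating-point noise of zero. Those figures are consistent with columns that were z-scored using the population standard deviation and then summarized with the sample (n - 1) variance. The source does not state the preprocessing, so treat that as an assumption. The sketch below reproduces the numbers and shows why the coefficient of variation (standard deviation divided by mean) is so large: the mean is close to zero.

```python
import math

n = 1599  # number of observations in the report

# Assumption: population variance is exactly 1 after standardization,
# so the sample variance is n / (n - 1).
sample_variance = n / (n - 1)
sample_std = math.sqrt(sample_variance)

# Mean reported for the first variable (floating-point residue near zero).
mean = 3.599372207e-16

# Coefficient of variation = standard deviation / mean.
cv = sample_std / mean

print(f"{sample_variance:.9f}")  # 1.000625782
print(f"{sample_std:.9f}")       # 1.000312842
print(f"{cv:.4e}")
```

Because the mean is only numerical residue, the CV carries no practical meaning for standardized columns.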
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -0.6432657007 | 67 | 4.2% |
| -0.7007187452 | 57 | 3.6% |
| -0.2985474338 | 53 | 3.3% |
| -0.4709065672 | 52 | 3.3% |
| -0.7581717897 | 50 | 3.1% |
| -0.3560004783 | 49 | 3.1% |
| -0.8730778786 | 46 | 2.9% |
| -0.4134535228 | 46 | 2.9% |
| -0.06873525585 | 45 | 2.8% |
| -0.5283596117 | 44 | 2.8% |
| Other values (86) | 1090 |
| Value | Count | Frequency (%) |
| -2.137044857 | 1 | 0.1% |
| -2.079591813 | 1 | 0.1% |
| -1.964685724 | 1 | 0.1% |
| -1.907232679 | 6 | |
| -1.849779635 | 4 | 0.3% |
| -1.79232659 | 6 | |
| -1.734873546 | 4 | 0.3% |
| -1.677420501 | 5 | 0.3% |
| -1.619967457 | 1 | 0.1% |
| -1.562514412 | 14 |
| Value | Count | Frequency (%) |
| 4.355149169 | 1 | |
| 4.182790036 | 2 | |
| 4.125336992 | 2 | |
| 3.838071769 | 2 | |
| 3.435900458 | 1 | |
| 3.263541324 | 1 | |
| 3.148635235 | 1 | |
| 3.091182191 | 2 | |
| 2.976276102 | 1 | |
| 2.918823057 | 1 |
volatile acidity
Real number (ℝ)
| Distinct | 144 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -2.843948411 × 10⁻¹⁶ |
| Minimum | -2.818740701 |
| Maximum | 5.692614089 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 819 |
| Negative (%) | 51.2% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -2.818740701 |
|---|---|
| 5-th percentile | -1.418138014 |
| Q1 | -0.7178366708 |
| median | -0.01753532733 |
| Q3 | 0.628896682 |
| 95-th percentile | 1.706283364 |
| Maximum | 5.692614089 |
| Range | 8.51135479 |
| Interquartile range (IQR) | 1.346733353 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | -3.517338213 × 10¹⁵ |
| Kurtosis | 1.269831833 |
| Mean | -2.843948411 × 10⁻¹⁶ |
| Median Absolute Deviation (MAD) | 0.6464320094 |
| Skewness | 0.4745800227 |
| Sum | -3.907985047 × 10⁻¹³ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0.4134193456 | 47 | 2.9% |
| -0.1252739956 | 46 | 2.9% |
| -0.5023593344 | 43 | 2.7% |
| 0.3595500115 | 39 | 2.4% |
| 0.3056806774 | 38 | 2.4% |
| -0.8794446732 | 38 | 2.4% |
| -0.6639673367 | 37 | 2.3% |
| -0.7717060049 | 35 | 2.2% |
| -0.1791433297 | 35 | 2.2% |
| -0.7178366708 | 35 | 2.2% |
| Other values (134) | 1206 |
| Value | Count | Frequency (%) |
| -2.818740701 | 14 | |
| -2.172308692 | 3 | 0.2% |
| -1.956831355 | 2 | 0.1% |
| -1.849092687 | 10 | |
| -1.795223353 | 2 | 0.1% |
| -1.741354019 | 3 | 0.2% |
| -1.687484685 | 6 | |
| -1.633615351 | 6 | |
| -1.579746017 | 5 | 0.3% |
| -1.525876683 | 13 |
| Value | Count | Frequency (%) |
| 5.692614089 | 1 | 0.1% |
| 4.345880736 | 2 | |
| 3.861056729 | 1 | 0.1% |
| 3.564775391 | 1 | 0.1% |
| 3.537840724 | 1 | 0.1% |
| 3.268494054 | 1 | 0.1% |
| 3.187690053 | 1 | 0.1% |
| 3.053016717 | 1 | 0.1% |
| 2.945278049 | 1 | 0.1% |
| 2.783670047 | 3 |
citric acid
Real number (ℝ)
| Distinct | 80 |
|---|---|
| Distinct (%) | 5.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -3.110568574 × 10⁻¹⁷ |
| Minimum | -1.391472278 |
| Maximum | 3.743573932 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 852 |
| Negative (%) | 53.3% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -1.391472278 |
|---|---|
| 5-th percentile | -1.391472278 |
| Q1 | -0.9293181188 |
| median | -0.05636026327 |
| Q3 | 0.7652471302 |
| 95-th percentile | 1.689555448 |
| Maximum | 3.743573932 |
| Range | 5.135046209 |
| Interquartile range (IQR) | 1.694565249 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | -3.215852081 × 10¹⁶ |
| Kurtosis | -0.7889975154 |
| Mean | -3.110568574 × 10⁻¹⁷ |
| Median Absolute Deviation (MAD) | 0.8729578556 |
| Skewness | 0.3183372953 |
| Sum | -1.207922651 × 10⁻¹³ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -1.391472278 | 132 | 8.3% |
| 1.124700365 | 68 | 4.3% |
| -0.1590611875 | 51 | 3.2% |
| -1.288771353 | 50 | 3.1% |
| -0.05636026327 | 38 | 2.4% |
| -0.8779676567 | 35 | 2.2% |
| -0.9806685809 | 33 | 2.1% |
| -0.3131125737 | 33 | 2.1% |
| -1.340121816 | 33 | 2.1% |
| 0.2517425093 | 32 | 2.0% |
| Other values (70) | 1094 |
| Value | Count | Frequency (%) |
| -1.391472278 | 132 | |
| -1.340121816 | 33 | 2.1% |
| -1.288771353 | 50 | 3.1% |
| -1.237420891 | 30 | 1.9% |
| -1.186070429 | 29 | 1.8% |
| -1.134719967 | 20 | 1.3% |
| -1.083369505 | 24 | 1.5% |
| -1.032019043 | 22 | 1.4% |
| -0.9806685809 | 33 | 2.1% |
| -0.9293181188 | 30 | 1.9% |
| Value | Count | Frequency (%) |
| 3.743573932 | 1 | 0.1% |
| 2.665214228 | 1 | 0.1% |
| 2.613863766 | 1 | 0.1% |
| 2.511162841 | 3 | |
| 2.459812379 | 1 | 0.1% |
| 2.408461917 | 4 | |
| 2.357111455 | 3 | |
| 2.305760993 | 1 | 0.1% |
| 2.254410531 | 1 | 0.1% |
| 2.203060069 | 2 |
residual sugar
Real number (ℝ)
| Distinct | 92 |
|---|---|
| Distinct (%) | 5.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -6.387774751 × 10⁻¹⁷ |
| Minimum | -1.766791648 |
| Maximum | 9.099651083 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1164 |
| Negative (%) | 72.8% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -1.766791648 |
|---|---|
| 5-th percentile | -0.7152004157 |
| Q1 | -0.4347760872 |
| median | -0.2244578408 |
| Q3 | 0.05596648771 |
| 95-th percentile | 1.808618541 |
| Maximum | 9.099651083 |
| Range | 10.86644273 |
| Interquartile range (IQR) | 0.4907425749 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | -1.565980144 × 10¹⁶ |
| Kurtosis | 27.44721975 |
| Mean | -6.387774751 × 10⁻¹⁷ |
| Median Absolute Deviation (MAD) | 0.2103182464 |
| Skewness | 4.377839208 |
| Sum | -6.838973832 × 10⁻¹⁴ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -0.3646700051 | 156 | 9.8% |
| -0.2244578408 | 131 | 8.2% |
| -0.5048821694 | 129 | 8.1% |
| -0.294563923 | 128 | 8.0% |
| -0.4347760872 | 117 | 7.3% |
| -0.1543517587 | 109 | 6.8% |
| -0.08424567656 | 84 | 5.3% |
| 0.05596648771 | 79 | 4.9% |
| -0.5749882515 | 76 | 4.8% |
| -0.01413959443 | 74 | 4.6% |
| Other values (82) | 516 |
| Value | Count | Frequency (%) |
| -1.766791648 | 12 | 0.8% |
| -1.135836909 | 2 | 0.1% |
| -0.9255186621 | 8 | 0.5% |
| -0.85541258 | 5 | 0.3% |
| -0.7853064979 | 35 | |
| -0.7152004157 | 30 | 1.9% |
| -0.6450943336 | 58 | |
| -0.6100412926 | 2 | 0.1% |
| -0.5749882515 | 76 | |
| -0.5399352104 | 2 | 0.1% |
| Value | Count | Frequency (%) |
| 9.099651083 | 1 | |
| 9.029545001 | 2 | |
| 7.977953769 | 1 | |
| 7.907847687 | 2 | |
| 7.627423358 | 1 | |
| 7.276892947 | 1 | |
| 5.944877387 | 2 | |
| 5.73455914 | 1 | |
| 4.542755744 | 1 | |
| 4.472649662 | 1 |
chlorides
Real number (ℝ)
| Distinct | 153 |
|---|---|
| Distinct (%) | 9.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.554935513 × 10⁻¹⁶ |
| Minimum | -1.603944891 |
| Maximum | 11.12703455 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1125 |
| Negative (%) | 70.4% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -1.603944891 |
|---|---|
| 5-th percentile | -0.7112885694 |
| Q1 | -0.3712290183 |
| median | -0.1799455208 |
| Q3 | 0.05384542064 |
| 95-th percentile | 0.8211047829 |
| Maximum | 11.12703455 |
| Range | 12.73097945 |
| Interquartile range (IQR) | 0.4250744389 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | 2.813870571 × 10¹⁵ |
| Kurtosis | 41.71578725 |
| Mean | 3.554935513 × 10⁻¹⁶ |
| Median Absolute Deviation (MAD) | 0.2125372194 |
| Skewness | 5.680346572 |
| Sum | 5.69322367 × 10⁻¹³ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -0.1586917988 | 66 | 4.1% |
| -0.2862141305 | 55 | 3.4% |
| -0.2437066866 | 51 | 3.2% |
| -0.2011992427 | 51 | 3.2% |
| -0.07367691103 | 49 | 3.1% |
| -0.3499752963 | 47 | 2.9% |
| -0.2224529646 | 47 | 2.9% |
| -0.1161843549 | 46 | 2.9% |
| -0.2649604085 | 45 | 2.8% |
| -0.1799455208 | 43 | 2.7% |
| Other values (143) | 1099 |
| Value | Count | Frequency (%) |
| -1.603944891 | 2 | 0.1% |
| -1.136363008 | 1 | 0.1% |
| -1.051348121 | 2 | 0.1% |
| -1.030094399 | 4 | |
| -0.9875869547 | 4 | |
| -0.9663332327 | 3 | |
| -0.9450795108 | 1 | 0.1% |
| -0.9238257888 | 5 | |
| -0.9025720669 | 4 | |
| -0.8813183449 | 4 |
| Value | Count | Frequency (%) |
| 11.12703455 | 1 | 0.1% |
| 11.10578083 | 1 | 0.1% |
| 8.066498594 | 1 | 0.1% |
| 8.002737428 | 1 | 0.1% |
| 7.110081106 | 1 | 0.1% |
| 6.961305053 | 3 | |
| 6.940051331 | 2 | |
| 6.918797609 | 1 | 0.1% |
| 6.706260389 | 1 | 0.1% |
| 6.663752946 | 1 | 0.1% |
free sulfur dioxide
Real number (ℝ)
HIGH CORRELATION
| Distinct | 60 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -6.498866486 × 10⁻¹⁷ |
| Minimum | -1.422500195 |
| Maximum | 5.367284318 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 924 |
| Negative (%) | 57.8% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -1.422500195 |
|---|---|
| 5-th percentile | -1.135607892 |
| Q1 | -0.8487155883 |
| median | -0.1793002138 |
| Q3 | 0.4901151607 |
| 95-th percentile | 1.82894591 |
| Maximum | 5.367284318 |
| Range | 6.789784513 |
| Interquartile range (IQR) | 1.338830749 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | -1.539211252 × 10¹⁶ |
| Kurtosis | 2.023562046 |
| Mean | -6.498866486 × 10⁻¹⁷ |
| Median Absolute Deviation (MAD) | 0.6694153745 |
| Skewness | 1.250567293 |
| Sum | -1.199040867 × 10⁻¹³ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -0.9443463561 | 138 | 8.6% |
| -1.039977124 | 104 | 6.5% |
| -0.5618232849 | 79 | 4.9% |
| -0.08366944599 | 78 | 4.9% |
| -0.3705617493 | 75 | 4.7% |
| -0.8487155883 | 71 | 4.4% |
| -0.6574540527 | 62 | 3.9% |
| 0.0119613218 | 61 | 3.8% |
| 0.1075920896 | 60 | 3.8% |
| -0.4661925171 | 59 | 3.7% |
| Other values (50) | 812 |
| Value | Count | Frequency (%) |
| -1.422500195 | 3 | 0.2% |
| -1.326869427 | 1 | 0.1% |
| -1.231238659 | 49 | 3.1% |
| -1.135607892 | 41 | 2.6% |
| -1.039977124 | 104 | |
| -0.9921617399 | 1 | 0.1% |
| -0.9443463561 | 138 | |
| -0.8487155883 | 71 | |
| -0.7530848205 | 56 | |
| -0.6574540527 | 62 |
| Value | Count | Frequency (%) |
| 5.367284318 | 1 | 0.1% |
| 4.984761247 | 2 | |
| 4.793499711 | 1 | 0.1% |
| 3.932822801 | 1 | 0.1% |
| 3.741561265 | 2 | |
| 3.645930498 | 1 | 0.1% |
| 3.55029973 | 1 | 0.1% |
| 3.454668962 | 3 | |
| 3.359038194 | 4 | |
| 3.263407427 | 2 |
total sulfur dioxide
Real number (ℝ)
HIGH CORRELATION
| Distinct | 144 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.109871027 × 10⁻¹⁷ |
| Minimum | -1.23058377 |
| Maximum | 7.37515394 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 976 |
| Negative (%) | 61.0% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -1.23058377 |
|---|---|
| 5-th percentile | -1.078538934 |
| Q1 | -0.744040295 |
| median | -0.2574968202 |
| Q3 | 0.4723183919 |
| 95-th percentile | 1.995807647 |
| Maximum | 7.37515394 |
| Range | 8.60573771 |
| Interquartile range (IQR) | 1.216358687 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | 1.406935285 × 10¹⁶ |
| Kurtosis | 3.809824488 |
| Mean | 7.109871027 × 10⁻¹⁷ |
| Median Absolute Deviation (MAD) | 0.5473614091 |
| Skewness | 1.515531258 |
| Sum | 7.105427358 × 10⁻¹⁴ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -0.5615864919 | 43 | 2.7% |
| -0.6832223606 | 36 | 2.3% |
| -0.9569030652 | 35 | 2.2% |
| -0.8656761637 | 35 | 2.2% |
| -0.7136313278 | 34 | 2.1% |
| -0.9873120323 | 33 | 2.1% |
| -0.8048582293 | 33 | 2.1% |
| -0.4703595904 | 32 | 2.0% |
| -0.2574968202 | 31 | 1.9% |
| -0.5919954591 | 30 | 1.9% |
| Other values (134) | 1257 |
| Value | Count | Frequency (%) |
| -1.23058377 | 3 | 0.2% |
| -1.200174803 | 4 | 0.3% |
| -1.169765835 | 14 | 0.9% |
| -1.139356868 | 14 | 0.9% |
| -1.108947901 | 27 | |
| -1.078538934 | 26 | |
| -1.048129967 | 29 | |
| -1.017721 | 28 | |
| -0.9873120323 | 33 | |
| -0.9569030652 | 35 |
| Value | Count | Frequency (%) |
| 7.37515394 | 1 | |
| 7.040655301 | 1 | |
| 3.604442011 | 1 | |
| 3.452397175 | 1 | |
| 3.300352339 | 1 | |
| 3.239534405 | 1 | |
| 3.209125437 | 1 | |
| 3.17871647 | 2 | |
| 3.117898536 | 1 | |
| 3.087489569 | 2 |
density
Real number (ℝ)
| Distinct | 436 |
|---|---|
| Distinct (%) | 27.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -3.470505795 × 10⁻¹⁴ |
| Minimum | -3.538731398 |
| Maximum | 3.680055125 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 798 |
| Negative (%) | 49.9% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -3.538731398 |
|---|---|
| 5-th percentile | -1.668843083 |
| Q1 | -0.607755666 |
| median | 0.001760083001 |
| Q3 | 0.5768249418 |
| 95-th percentile | 1.724304591 |
| Maximum | 3.680055125 |
| Range | 7.218786523 |
| Interquartile range (IQR) | 1.184580608 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | -2.882325809 × 10¹³ |
| Kurtosis | 0.9340790655 |
| Mean | -3.470505795 × 10⁻¹⁴ |
| Median Absolute Deviation (MAD) | 0.5989154751 |
| Skewness | 0.07128766295 |
| Sum | -5.551933913 × 10⁻¹¹ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0.2402662457 | 36 | 2.3% |
| 0.02826076774 | 35 | 2.2% |
| 0.4522717236 | 35 | 2.2% |
| 0.6642772015 | 29 | 1.8% |
| -0.2897474491 | 28 | 1.8% |
| 0.5582744625 | 26 | 1.6% |
| -0.1837447102 | 25 | 1.6% |
| 1.406296374 | 24 | 1.5% |
| 0.1342635067 | 24 | 1.5% |
| 0.7702799404 | 23 | 1.4% |
| Other values (426) | 1314 |
| Value | Count | Frequency (%) |
| -3.538731398 | 2 | |
| -3.469829618 | 1 | |
| -3.236623592 | 2 | |
| -3.151821401 | 1 | |
| -3.130620853 | 1 | |
| -2.939815923 | 1 | |
| -2.780811815 | 1 | |
| -2.759611267 | 1 | |
| -2.743710856 | 1 | |
| -2.727810445 | 2 |
| Value | Count | Frequency (%) |
| 3.680055125 | 2 | |
| 3.420348414 | 1 | 0.1% |
| 3.39384773 | 3 | |
| 3.256044169 | 1 | 0.1% |
| 3.102340198 | 2 | |
| 3.006937732 | 2 | |
| 2.89033472 | 2 | |
| 2.83733335 | 2 | |
| 2.678329242 | 1 | 0.1% |
| 2.519325133 | 2 |
pH
Real number (ℝ)
| Distinct | 90 |
|---|---|
| Distinct (%) | 5.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.987153406 × 10⁻¹⁵ |
| Minimum | -3.713692856 |
| Maximum | 4.53570697 |
| Zeros | 5 |
| Zeros (%) | 0.3% |
| Negative | 817 |
| Negative (%) | 51.1% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -3.713692856 |
|---|---|
| 5-th percentile | -1.635103923 |
| Q1 | -0.6607653611 |
| median | -0.01120631972 |
| Q3 | 0.5733968175 |
| 95-th percentile | 1.677647188 |
| Maximum | 4.53570697 |
| Range | 8.249399826 |
| Interquartile range (IQR) | 1.234162179 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | 5.033898435 × 10¹⁴ |
| Kurtosis | 0.8342418229 |
| Mean | 1.987153406 × 10⁻¹⁵ |
| Median Absolute Deviation (MAD) | 0.6495590414 |
| Skewness | 0.1895960525 |
| Sum | 3.18101101 × 10⁻¹² |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -0.07616222386 | 57 | 3.6% |
| 0.313573201 | 56 | 3.5% |
| -0.3359858404 | 53 | 3.3% |
| 0.5084409134 | 48 | 3.0% |
| 0.4434850093 | 48 | 3.0% |
| -0.141118128 | 46 | 2.9% |
| 0.05374958442 | 45 | 2.8% |
| 0.1836613927 | 43 | 2.7% |
| -0.2060740321 | 42 | 2.6% |
| -0.7257212653 | 39 | 2.4% |
| Other values (80) | 1122 |
| Value | Count | Frequency (%) |
| -3.713692856 | 1 | 0.1% |
| -2.934222006 | 1 | 0.1% |
| -2.869266102 | 1 | 0.1% |
| -2.804310198 | 2 | |
| -2.739354294 | 4 | |
| -2.674398389 | 1 | 0.1% |
| -2.544486581 | 4 | |
| -2.479530677 | 3 | |
| -2.414574773 | 4 | |
| -2.349618869 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 4.53570697 | 2 | |
| 3.821192024 | 2 | |
| 3.496412504 | 1 | 0.1% |
| 3.041721175 | 2 | |
| 2.846853462 | 1 | 0.1% |
| 2.781897558 | 1 | 0.1% |
| 2.65198575 | 3 | |
| 2.587029846 | 4 | |
| 2.522073942 | 1 | 0.1% |
| 2.457118038 | 4 |
sulphates
Real number (ℝ)
| Distinct | 96 |
|---|---|
| Distinct (%) | 6.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.754377476 × 10⁻¹⁶ |
| Minimum | -1.936507291 |
| Maximum | 7.918676552 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 964 |
| Negative (%) | 60.3% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -1.936507291 |
|---|---|
| 5-th percentile | -1.110324214 |
| Q1 | -0.638219599 |
| median | -0.2251280607 |
| Q3 | 0.4240157852 |
| 95-th percentile | 1.604277323 |
| Maximum | 7.918676552 |
| Range | 9.855183842 |
| Interquartile range (IQR) | 1.062235384 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | 1.480984511 × 10¹⁵ |
| Kurtosis | 11.72025073 |
| Mean | 6.754377476 × 10⁻¹⁶ |
| Median Absolute Deviation (MAD) | 0.4721046152 |
| Skewness | 2.428672354 |
| Sum | 1.080024958 × 10⁻¹² |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -0.3431542145 | 69 | 4.3% |
| -0.4611803683 | 68 | 4.3% |
| -0.6972326759 | 68 | 4.3% |
| -0.2251280607 | 61 | 3.8% |
| -0.5792065221 | 60 | 3.8% |
| -0.5201934452 | 55 | 3.4% |
| -0.7562457528 | 51 | 3.2% |
| -0.4021672914 | 51 | 3.2% |
| -0.638219599 | 50 | 3.1% |
| -0.1661149838 | 48 | 3.0% |
| Other values (86) | 1018 |
| Value | Count | Frequency (%) |
| -1.936507291 | 1 | 0.1% |
| -1.700454983 | 2 | 0.1% |
| -1.582428829 | 6 | 0.4% |
| -1.523415753 | 4 | 0.3% |
| -1.405389599 | 5 | 0.3% |
| -1.346376522 | 8 | |
| -1.287363445 | 16 | |
| -1.228350368 | 12 | |
| -1.169337291 | 18 | |
| -1.110324214 | 19 |
| Value | Count | Frequency (%) |
| 7.918676552 | 1 | 0.1% |
| 7.800650398 | 1 | 0.1% |
| 7.623611167 | 2 | |
| 5.676179629 | 1 | 0.1% |
| 5.617166552 | 1 | 0.1% |
| 5.499140399 | 1 | 0.1% |
| 5.322101168 | 1 | 0.1% |
| 4.14183963 | 3 | |
| 4.023813476 | 1 | 0.1% |
| 3.964800399 | 1 | 0.1% |
alcohol
Real number (ℝ)
| Distinct | 65 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.299426135 × 10⁻¹⁶ |
| Minimum | -1.898918597 |
| Maximum | 4.202452586 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 916 |
| Negative (%) | 57.3% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -1.898918597 |
|---|---|
| 5-th percentile | -1.147980605 |
| Q1 | -0.8663788579 |
| median | -0.2093081152 |
| Q3 | 0.6354971255 |
| 95-th percentile | 1.949638611 |
| Maximum | 4.202452586 |
| Range | 6.101371183 |
| Interquartile range (IQR) | 1.501875983 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | 7.698112381 × 10¹⁵ |
| Kurtosis | 0.2000293113 |
| Mean | 1.299426135 × 10⁻¹⁶ |
| Median Absolute Deviation (MAD) | 0.6570707427 |
| Skewness | 0.8608288069 |
| Sum | 1.580402476 × 10⁻¹³ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -0.8663788579 | 139 | 8.7% |
| -0.9602461069 | 103 | 6.4% |
| -0.584777111 | 78 | 4.9% |
| -1.147980605 | 72 | 4.5% |
| -0.3970426131 | 67 | 4.2% |
| 0.07229363171 | 67 | 4.2% |
| -1.054113356 | 59 | 3.7% |
| 0.5416298765 | 59 | 3.7% |
| -0.772511609 | 59 | 3.7% |
| -0.67864436 | 54 | 3.4% |
| Other values (55) | 842 |
| Value | Count | Frequency (%) |
| -1.898918597 | 2 | 0.1% |
| -1.805051348 | 1 | 0.1% |
| -1.61731685 | 2 | 0.1% |
| -1.523449601 | 2 | 0.1% |
| -1.335715103 | 30 | |
| -1.288781478 | 1 | 0.1% |
| -1.241847854 | 23 | 1.4% |
| -1.147980605 | 72 | |
| -1.116691522 | 1 | 0.1% |
| -1.10104698 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 4.202452586 | 1 | 0.1% |
| 3.357647345 | 7 | |
| 2.98217835 | 4 | |
| 2.950889267 | 1 | 0.1% |
| 2.888311101 | 1 | 0.1% |
| 2.794443852 | 3 | |
| 2.700576603 | 3 | |
| 2.606709354 | 1 | 0.1% |
| 2.512842105 | 2 | 0.1% |
| 2.418974856 | 6 |
quality
Real number (ℝ)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.443669392 × 10⁻¹⁷ |
| Minimum | -3.265164633 |
| Maximum | 2.928190347 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 744 |
| Negative (%) | 46.5% |
| Memory size | 12.6 KiB |
Quantile statistics
| Minimum | -3.265164633 |
|---|---|
| 5-th percentile | -0.7878226409 |
| Q1 | -0.7878226409 |
| median | 0.450848355 |
| Q3 | 0.450848355 |
| 95-th percentile | 1.689519351 |
| Maximum | 2.928190347 |
| Range | 6.19335498 |
| Interquartile range (IQR) | 1.238670996 |
Descriptive statistics
| Standard deviation | 1.000312842 |
|---|---|
| Coefficient of variation (CV) | 2.251096456 × 10¹⁶ |
| Kurtosis | 0.2967081198 |
| Mean | 4.443669392 × 10⁻¹⁷ |
| Median Absolute Deviation (MAD) | 1.238670996 |
| Skewness | 0.2178015755 |
| Sum | 7.105427358 × 10⁻¹⁴ |
| Variance | 1.000625782 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
| Value | Count | Frequency (%) |
| -0.7878226409 | 681 | 42.6% |
| 0.450848355 | 638 | 39.9% |
| 1.689519351 | 199 | 12.4% |
| -2.026493637 | 53 | 3.3% |
| 2.928190347 | 18 | 1.1% |
| -3.265164633 | 10 | 0.6% |
| Value | Count | Frequency (%) |
| -3.265164633 | 10 | 0.6% |
| -2.026493637 | 53 | 3.3% |
| -0.7878226409 | 681 | 42.6% |
| 0.450848355 | 638 | 39.9% |
| 1.689519351 | 199 | 12.4% |
| 2.928190347 | 18 | 1.1% |
| Value | Count | Frequency (%) |
| 2.928190347 | 18 | 1.1% |
| 1.689519351 | 199 | 12.4% |
| 0.450848355 | 638 | 39.9% |
| -0.7878226409 | 681 | 42.6% |
| -2.026493637 | 53 | 3.3% |
| -3.265164633 | 10 | 0.6% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
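The rank-based definition above can be sketched in plain Python. This is an illustrative implementation, not the profiler's own code; the helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and ties receive the average of the ranks they span (a common convention, assumed here).

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman_rho(x, y):
    """Spearman's rho is Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho = spearman_rho(x, y)  # ranks agree perfectly
r = pearson_r(x, y)       # linear fit is imperfect
print(rho, r)
```

On this toy data ρ is 1 (up to floating-point error) while r stays below 1, which is the sense in which ρ catches nonlinear monotonic relationships.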
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
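The invariance property can be checked numerically. The sketch below uses made-up data and a hypothetical helper `pearson_r`; it shifts and rescales each variable separately (with positive scale factors) and compares the two coefficients.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Illustrative data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)

# Change location and scale of each variable independently.
x_rescaled = [10 * a + 3 for a in x]
y_rescaled = [0.5 * b - 7 for b in y]
r_rescaled = pearson_r(x_rescaled, y_rescaled)

print(math.isclose(r_original, r_rescaled))
```

This is also why the standardized columns in this report have the same pairwise r as the unscaled data would.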
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
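The pair-counting formula described above corresponds to the variant usually called tau-a. A minimal sketch follows; the function name `kendall_tau_a` is hypothetical. Statistical libraries often default to tie-adjusted variants such as tau-b, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data: 6 pairs in total, 5 concordant and 1 discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)
print(tau)  # (5 - 1) / 6
```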
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
First rows
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.528360 | 0.952113 | -1.391472 | -0.434776 | -0.243707 | -0.466193 | -0.379133 | 0.558274 | 1.287912 | -0.579207 | -0.960246 | -0.787823 |
| 1 | -0.298547 | 1.921761 | -1.391472 | 0.055966 | 0.223875 | 0.872638 | 0.624363 | 0.028261 | -0.725721 | 0.128950 | -0.584777 | -0.787823 |
| 2 | -0.298547 | 1.275329 | -1.186070 | -0.154352 | 0.096353 | -0.083669 | 0.229047 | 0.134264 | -0.335986 | -0.048089 | -0.584777 | -0.787823 |
| 3 | 1.654856 | -1.310399 | 1.484154 | -0.434776 | -0.264960 | 0.107592 | 0.411500 | 0.664277 | -0.985545 | -0.461180 | -0.584777 | 0.450848 |
| 4 | -0.528360 | 0.952113 | -1.391472 | -0.434776 | -0.243707 | -0.466193 | -0.379133 | 0.558274 | 1.287912 | -0.579207 | -0.960246 | -0.787823 |
| 5 | -0.528360 | 0.736635 | -1.391472 | -0.504882 | -0.264960 | -0.274931 | -0.196679 | 0.558274 | 1.287912 | -0.579207 | -0.960246 | -0.787823 |
| 6 | -0.241094 | 0.413419 | -1.083370 | -0.645094 | -0.392483 | -0.083669 | 0.381091 | -0.183745 | -0.076162 | -1.169337 | -0.960246 | -0.787823 |
| 7 | -0.585813 | 0.682766 | -1.391472 | -0.925519 | -0.477498 | -0.083669 | -0.774449 | -1.137769 | 0.508441 | -1.110324 | -0.397043 | 1.689519 |
| 8 | -0.298547 | 0.305681 | -1.288771 | -0.364670 | -0.307468 | -0.657454 | -0.865676 | 0.028261 | 0.313573 | -0.520193 | -0.866379 | 1.689519 |
| 9 | -0.470907 | -0.125274 | 0.457144 | 2.509679 | -0.349975 | 0.107592 | 1.688677 | 0.558274 | 0.248617 | 0.837107 | 0.072294 | -0.787823 |
Last rows
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1589 | -0.987984 | 1.086786 | -0.364463 | 3.701483 | -0.307468 | 1.255161 | 0.989271 | 0.505273 | -0.141118 | -0.697233 | -1.147981 | -0.787823 |
| 1590 | -1.160343 | 0.144073 | -0.621215 | -0.504882 | -0.222453 | 0.968269 | -0.348724 | -1.911589 | 0.053750 | 0.955133 | 1.104833 | 0.450848 |
| 1591 | -1.677421 | 1.167590 | -0.929318 | -0.574988 | 0.032592 | 0.011961 | -0.622404 | -1.445177 | 2.327206 | -0.579207 | 1.104833 | 0.450848 |
| 1592 | -1.160343 | -0.071405 | -0.723916 | -0.154352 | -0.243707 | 1.255161 | -0.196679 | -0.533554 | 0.703309 | 0.542042 | 0.541630 | 0.450848 |
| 1593 | -0.873078 | 0.521158 | -0.980669 | -0.434776 | -0.413736 | 1.159531 | -0.257497 | -0.125443 | 0.703309 | 0.955133 | -0.866379 | 0.450848 |
| 1594 | -1.217796 | 0.413419 | -0.980669 | -0.364670 | 0.053845 | 1.542054 | -0.075043 | -0.978765 | 0.898176 | -0.461180 | 0.072294 | -0.787823 |
| 1595 | -1.390155 | 0.144073 | -0.877968 | -0.224458 | -0.541259 | 2.211469 | 0.137820 | -0.862162 | 1.352868 | 0.601055 | 0.729364 | 0.450848 |
| 1596 | -1.160343 | -0.071405 | -0.723916 | -0.154352 | -0.243707 | 1.255161 | -0.196679 | -0.533554 | 0.703309 | 0.542042 | 0.541630 | 0.450848 |
| 1597 | -1.390155 | 0.655831 | -0.775267 | -0.364670 | -0.264960 | 1.542054 | -0.075043 | -0.676657 | 1.677647 | 0.305990 | -0.209308 | -0.787823 |
| 1598 | -1.332702 | -1.148791 | 1.021999 | 0.757027 | -0.434990 | 0.203223 | -0.135861 | -0.666057 | 0.508441 | 0.010924 | 0.541630 | 0.450848 |
Most frequently occurring
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | -0.930531 | -0.340751 | -0.159061 | -0.574988 | -0.222453 | 0.203223 | -0.379133 | -1.031767 | 0.508441 | -0.343154 | 0.166161 | 0.450848 | 4 |
| 51 | -0.643266 | -0.879445 | 0.970649 | -0.294564 | -0.286214 | 0.777007 | -0.075043 | -0.745559 | 0.573397 | 1.132173 | 0.541630 | 1.689519 | 4 |
| 61 | -0.643266 | 0.925178 | -0.723916 | -0.364670 | -0.243707 | -0.370562 | -0.804858 | -0.681958 | -0.141118 | -0.697233 | -0.303175 | -0.787823 | 4 |
| 78 | -0.470907 | -0.071405 | -1.288771 | -0.574988 | -0.073677 | -0.274931 | -0.470360 | -0.724359 | 0.313573 | -0.697233 | 0.072294 | 0.450848 | 4 |
| 5 | -1.332702 | -0.125274 | -1.391472 | -0.785306 | -0.647527 | -0.083669 | -0.622404 | -1.201371 | 0.313573 | -1.228350 | -0.866379 | -0.787823 | 3 |
| 12 | -1.102890 | 0.628897 | -0.313113 | -0.504882 | -0.137438 | -0.179300 | -0.470360 | 0.075962 | 1.807559 | 0.010924 | -0.584777 | -0.787823 | 3 |
| 37 | -0.758172 | 0.682766 | -1.288771 | -0.294564 | -0.456244 | -0.753085 | -0.652813 | 0.240266 | 1.028088 | 0.069937 | -0.866379 | 0.450848 | 3 |
| 38 | -0.758172 | 0.898243 | -1.032019 | -0.014140 | 0.075099 | -0.083669 | -0.774449 | -0.544154 | 0.443485 | -0.343154 | 0.823232 | 0.450848 | 3 |
| 58 | -0.643266 | 0.575027 | -1.391472 | -0.434776 | 0.202621 | -0.179300 | -0.257497 | 0.001760 | 0.378529 | -0.461180 | -1.335715 | 0.450848 | 3 |
| 101 | -0.298547 | 0.413419 | -0.056360 | -0.364670 | -0.158692 | 1.446423 | 2.570537 | -0.279147 | -0.660765 | -0.815259 | -0.490910 | -0.787823 | 3 |
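A table like the one above can be approximated by counting identical rows. The sketch below uses three columns (fixed acidity, volatile acidity, quality) from the first five rows shown under "First rows", where rows 0 and 4 are identical. How the profiler itself counts duplicates is not stated in this report, so both tallies below are assumptions for illustration.

```python
from collections import Counter

# (fixed acidity, volatile acidity, quality) for rows 0-4 of "First rows".
rows = [
    (-0.528360, 0.952113, -0.787823),   # row 0
    (-0.298547, 1.921761, -0.787823),   # row 1
    (-0.298547, 1.275329, -0.787823),   # row 2
    (1.654856, -1.310399, 0.450848),    # row 3
    (-0.528360, 0.952113, -0.787823),   # row 4, same as row 0
]

counts = Counter(rows)

# Rows that occur more than once, with their occurrence counts.
repeated = {row: count for row, count in counts.items() if count > 1}

# Number of extra copies beyond the first occurrence of each row.
extra_copies = sum(count - 1 for count in counts.values())

print(repeated)
print(extra_copies)
```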